17 research outputs found
Recommended from our members
Per-Core DVFS with Switched-Capacitor Converters for Energy Efficiency in Manycore Processors
Integrating multiple power converters on-chip improves energy efficiency of manycore architectures. Switched-capacitor (SC) dc-dc converters are compatible with conventional CMOS processes, but traditional implementations suffer from limited conversion efficiency. We propose a dynamic voltage and frequency scaling scheme with SC converters that achieves high converter efficiency by allowing the output voltage to ripple and having the processor core frequency track the ripple. Minimum core energy is achieved by hopping between different converter modes and tuning body-bias voltages. A multicore processor model based on a 28-nm technology shows conversion efficiencies of 90% along with over 25% improvement in the overall chip energy efficiency
Recommended from our members
A RISC-V Vector Processor With Simultaneous-Switching Switched-Capacitor DC-DC Converters in 28 nm FDSOI
This work demonstrates a RISC-V vector microprocessor implemented in 28 nm FDSOI with fully integrated simultaneous-switching switched-capacitor DC-DC (SC DC-DC) converters and adaptive clocking that generates four on-chip voltages between 0.45 and 1 V using only 1.0 V core and 1.8 V IO voltage inputs. The converters achieve high efficiency at the system level by switching simultaneously to avoid charge-sharing losses and by using an adaptive clock to maximize performance for the resulting voltage ripple. Details about the implementation of the DC-DC switches, DC-DC controller, and adaptive clock are provided, and the sources of conversion loss are analyzed based on measured results. This system pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20 ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80%-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices
Recommended from our members
GAIL: The graph algorithm iron law
Copyright 2015 ACM. As new applications for graph algorithms emerge, there has been a great deal of research interest in improving graph processing. However, it is often difficult to understand how these new contributions improve performance. Execution time, the most commonly reported metric, distinguishes which alternative is the fastest but does not give any insight as to why. A new contribution may have an algorithmic innova- tion that allows it to examine fewer graph edges. It could also have an implementation optimization that reduces com- munication. It could even have optimizations that allow it to increase its memory bandwidth utilization. More interest- ingly, a new innovation may simultaneously affect all three of these factors (algorithmic work, communication volume, and memory bandwidth utilization). We present the Graph Algorithm Iron Law (GAIL) to quantify these tradeoffs to help understand graph algorithm performance
Recommended from our members
Locality exists in graph processing: Workload characterization on an ivy bridge server
© 2015 IEEE. Graph processing is an increasingly important application domain and is typically communication-bound. In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server. Unlike many other communication-bound workloads, graph algorithms struggle to fully utilize the platform's memory bandwidth and so increasing memory bandwidth utilization could be just as effective as decreasing communication. Based on our observations of simultaneous low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new memory system
Recommended from our members
GAIL: The graph algorithm iron law
Copyright 2015 ACM. As new applications for graph algorithms emerge, there has been a great deal of research interest in improving graph processing. However, it is often difficult to understand how these new contributions improve performance. Execution time, the most commonly reported metric, distinguishes which alternative is the fastest but does not give any insight as to why. A new contribution may have an algorithmic innova- tion that allows it to examine fewer graph edges. It could also have an implementation optimization that reduces com- munication. It could even have optimizations that allow it to increase its memory bandwidth utilization. More interest- ingly, a new innovation may simultaneously affect all three of these factors (algorithmic work, communication volume, and memory bandwidth utilization). We present the Graph Algorithm Iron Law (GAIL) to quantify these tradeoffs to help understand graph algorithm performance
Recommended from our members
Per-Core DVFS with Switched-Capacitor Converters for Energy Efficiency in Manycore Processors
Integrating multiple power converters on-chip improves energy efficiency of manycore architectures. Switched-capacitor (SC) dc-dc converters are compatible with conventional CMOS processes, but traditional implementations suffer from limited conversion efficiency. We propose a dynamic voltage and frequency scaling scheme with SC converters that achieves high converter efficiency by allowing the output voltage to ripple and having the processor core frequency track the ripple. Minimum core energy is achieved by hopping between different converter modes and tuning body-bias voltages. A multicore processor model based on a 28-nm technology shows conversion efficiencies of 90% along with over 25% improvement in the overall chip energy efficiency